perm filename RIPPLE[DIS,DBL]1 blob sn#213800 filedate 1976-05-02 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00007 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.ASEC(Rippling to Locate Relevant Heuristics)
C00006 00003	.ASSEC(The Domain of Applicability of a Heuristic)
C00008 00004	.ASSECP(Efficiently Locating Specializations)
C00020 00005	.ASSECP(Efficiently Locating Examples)
C00033 00006	.ASSECP(Efficiently Locating Relevant Heuristics)
C00042 00007	.ASSECP(Independence Assumptions)
C00049 ENDMK
C⊗;
.ASEC(Rippling to Locate Relevant Heuristics)

This appendix
discusses in excruciating 
detail how AM is able to zero in  on the relevant heuristic rules,
once a task has been selected from the agenda.

Each heuristic H has a domain of applicability of the form "H applies to
all examples of C". The author identified the proper concept C for each heuristic
AM possessed, and tacked that heuristic onto that concept C.
This was all preparation. As AM is running, it will choose a task dealing
with, say, concept X. 
How does AM locate the heuristic rules which are relevant to X?
By a process called ⊗4rippling upward from X in the Isa direction⊗*.

.ASSEC(The Domain of Applicability of a Heuristic)

AM contains hundreds of heuristics. In order to save time (and to
make AM behave more rationally), each heuristic
should only be tried in situations where it might apply, where it makes sense.

The key observation is that a heuristic typically applies to
"all examples of a particular concept C". 
In general, then, we tack each heuristic to the
concept C whose examples it applies to.

So the whole problem of locating relevant heuristics has been reduced to the 
problem of efficiently finding all concepts of which C is an example
(for a given concept C). This relationship will be denoted "ISA"$$
i.e., "C ISA X" iff "C is an example of X"
iff "X ε ISA's(C)". $.
As we shall see shortly, the
task of locating all ISA's of C 
is closely related to the efficient location of all generalizations and
specializations of a given concept.
The next subsections will deal with those problems: how AM quickly assembles a list
of all the concepts which are generalizations of (or examples of, or
specializations of, or ISA's of) a given concept.
This process will be known as "⊗4rippling⊗*" away from the given concept,
in the Generalizations (or the Specializations, or the Isa's, or the Exs) direction.

.ASSECP(Efficiently Locating Specializations)

In the last subsection, we saw how nice it would be to be able to tell
quickly what all the generalizations, specializations, isa's$$
"x ISA y" iff x is an example of y. The ISA link points from
a concept x to all concepts y  which claim x as an example.$, and
examples  of a
given concept are. Below is the scheme AM uses to do this.

The most obvious scheme is to store this information explicitly.
So the Examples facet of C would point to all known examples of C, etc.
But there is one crucial (though simple) idea that prevents this:
One can substitute a modest amount
of processing time (via chasing links around) for the vast amount of storage space
that would be needed to have "everything point to everything".
We'll cover this idea slowly, although many readers will already
grok this, and can just skim this section.
Many other readers won't care about this space-saving idea, and may wish to
skip this and move immediately on to the next chapter.

One of the facets a concept may have is "Specializations".
For example, the Specializations facet of Numbers might be the list
(Primes Odds Evens Squares Maximally-divisibles Perfects Amicables).
The Specializations facet of Primes might point to Odd-primes, and ⊗4its⊗*
Specializations facet might contain the entry "Mersenne-primes".

And yet Mersenne-primes are clearly a specialized kind of Number, just as
Primes are. Should the Specializations facet of Numbers point to
them as well? Can't we easily "deduce" that Mersenne-primes are a specialization
of the concept Numbers?

To deduce that, one must realize that a specialization of a 
specialization of X is itself a specialization of X. If X.Spec denotes
all the entries on the Specializations facet of X, then 
(X.Spec).Spec  
will mean all  ⊗4their⊗* specialization entries, 
and will be written
X.Spec↑2.
The necessary deductive knowledge can be stated as: X.Spec↑n are
specializations of X (for any n).
The function Specializations(X) will be used to signify the results of this
repeated gathering: X.Spec ∪ (X.Spec).Spec ∪...
In fact, this coincides with our intuitive notion of what is -- and isn't --
a specialized kind of X.
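
To make this concrete, here is a minimal sketch of that repeated gathering
(in modern Python, with hypothetical names; AM itself is a LISP program),
using the small network of concepts drawn below:

.B0

# A minimal sketch, not AM's actual code: each concept's Spec facet
# is a list of its immediately more specialized concepts.
SPEC = {
    "Number":      ["Odd-numbers", "Primes"],
    "Odd-numbers": ["Odd-primes"],
    "Primes":      ["Odd-primes", "Even-primes"],
    "Odd-primes":  ["Mersenne-primes"],
}

def specializations(x):
    """X.Spec ∪ (X.Spec).Spec ∪ ... : ripple away from x in the Spec direction."""
    found, frontier = set(), list(SPEC.get(x, []))
    while frontier:
        c = frontier.pop()
        if c not in found:
            found.add(c)                      # c is a specialization of x ...
            frontier.extend(SPEC.get(c, []))  # ... and so are c's own Specs
    return found

# specializations("Number") ==> {'Odd-numbers', 'Primes', 'Odd-primes',
#                                'Even-primes', 'Mersenne-primes'}

.E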

Graphically, we may signify a Spec relationship by a 
downward slanting arc.
Here is a portion of the network of
Spec links to and from these concepts:

.B0 INDENT 25

Object
   \
    \
     \
      \
       \
	\
	Number
	 / \
	/   \
       /     \
      /	      \
     /	       \
    /		\
Odd-numbers    Primes
    \		/ \
     \	       /   \
      \       /     \
       \     /       \
        \   /	      \
  	 \ /	       \
      Odd-primes   Even-primes
	   \
	    \
	     \
	      \
	       \
		\
	 Mersenne-primes

.E	  

It is clear by inspection$$ Visually. It is slower -- but still
quite fast -- to compute this given a nongeometric representation: a 
list of nodes with
link fields. $ that Mersenne-primes are a specialization of Numbers.
Also, we see that Mersenne-primes is ⊗4not⊗* known to be a specialization of
Even-primes, since one has to go up and then downward again to travel
along the graph from one of those concepts to the other.
X is  a specialization of Y iff you can leave the node X and move
steadily upward (along the slanting arcs) to reach node Y.
Y is a generalization of X iff you can leave node Y and move steadily
downward to reach X. So Object is a generalization of ⊗4all⊗* the nodes
shown in the little graph above.

The diagram is a graphical presentation of the contents$$
Only a portion of the contents. This is all we need see at the moment, for
illustrative purposes. In later chapters, we'll look into these concepts in
greater detail and completeness. $
of the Specializations and Generalizations
facets of the concepts shown.
Each arc actually stands for two: each Spec arc from A to B
has a twin arc from B to A, labelled "Genl", for Generalizations.
Thus one picture can represent both networks$$ One says that the
nodes are doubly-linked. $. When one
traces upwards (downwards) from a particular node, he can reach precisely
the set of generalizations (specializations) of that node.
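
As a tiny sketch of this double-linking convention (Python again; the dictionary
layout is a hypothetical stand-in for AM's LISP property-list format):

.B0

from collections import defaultdict

concepts = defaultdict(dict)

def add_spec_link(a, b):
    """Record that b is a specialization of a.  The twin Genl arc is
    stored too, so the network can be traced in either direction."""
    concepts[a].setdefault("Spec", []).append(b)
    concepts[b].setdefault("Genl", []).append(a)

add_spec_link("Number", "Primes")
# concepts["Number"]["Spec"] ==> ['Primes']
# concepts["Primes"]["Genl"] ==> ['Number']

.E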

Let's state precisely what we mean when we say that
a concept A is a ⊗4specialization⊗* of a
concept B: we mean that A could be defined as ⊗4"both B.Defn and P"⊗*,
where ⊗4B.Defn⊗* is
some definition of B, and ⊗4P⊗* is any predicate.
For example, the definition of Odd-primes can be stated as
"Odd-prime(x) iff both Prime(x) and Odd-number(x)". So Odd-primes is a
specialization of Primes, with P=Odd-number$$ Odd-primes is also a specialization
of Odd-numbers, with P=Prime. $.
Similarly, Mersenne-primes is a specialization of Numbers, since we can write
"Mersenne-prime(x) iff both Number(x) and (both Prime(x) and (both Odd(x)
and x is of the form 2↑n-1))". That last definition is of the form
"both Number(x) and [...]", so Mersenne-primes is a specialization of Numbers.

As we move down the graph, each node "adds" one more conjunct onto the definition
of the node above it. As we move from Primes to Odd-primes, the extra conjunct is
the predicate Odd(x). One can now see an efficient algorithm for categorizing
an unknown item x. We start at the top of the graph, and run Object.Defn
on x. If it passes that test, we run the "additional" predicate that would make
it a Number. If that succeeds, we run the additional tests for Odd-numbers and
for Primes. In general, when x passes the test at node N, we run the tests of all
the nodes immediately below N. When this process terminates, all the nodes whose
tests x passed can claim x as an example. This is cute, and efficient in some sense,
but it still requires a fair amount of processing to assemble the list of all
concepts of which x is an example. We can do this with much less processing, as
the reader will soon see.
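
In code, that top-down testing scheme might look roughly like this (a Python
sketch, with toy predicates standing in for the concepts' Defn facets; each node
here re-checks its full definition, so that a node with two generalizations,
like Odd-primes, is handled correctly, and the hierarchy is used only to prune
the search):

.B0

# If x fails the test at node N, none of the nodes below N are even tried.
DEFN = {
    "Object":          lambda x: True,
    "Number":          lambda x: isinstance(x, int),
    "Odd-numbers":     lambda x: isinstance(x, int) and x % 2 == 1,
    "Primes":          lambda x: isinstance(x, int) and x > 1
                                 and all(x % d for d in range(2, x)),
    "Odd-primes":      lambda x: DEFN["Odd-numbers"](x) and DEFN["Primes"](x),
    "Even-primes":     lambda x: DEFN["Primes"](x) and x % 2 == 0,
    "Mersenne-primes": lambda x: DEFN["Odd-primes"](x)
                                 and any(2**n - 1 == x for n in range(2, 64)),
}
BELOW = {"Object":      ["Number"],
         "Number":      ["Odd-numbers", "Primes"],
         "Odd-numbers": ["Odd-primes"],
         "Primes":      ["Odd-primes", "Even-primes"],
         "Odd-primes":  ["Mersenne-primes"]}

def claimants(x, node="Object", found=None):
    """Every node that can claim x as an example."""
    found = set() if found is None else found
    if node not in found and DEFN[node](x):
        found.add(node)
        for child in BELOW.get(node, []):
            claimants(x, child, found)
    return found

# claimants(3) ==> {'Object', 'Number', 'Odd-numbers', 'Primes',
#                   'Odd-primes', 'Mersenne-primes'}

.E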

We formally define the relationship 
"Generalization" to be the converse of Specialization.

To find all the generalizations of a concept C, we must follow all of C's Genl
links (all the lines
emanating upward from C) and collect a list of all the nodes they terminate in,
then follow
the Genl links from ⊗4those⊗* nodes, etc. This process will quickly halt$$
Each link points to a higher node, 
and there is a well-defined top node, "Anything", and there are no cycles.
The longest path from a node to Anything is only about log↓B(N), when there are N
nodes evenly distributed in a tree with B branches per node. In 
the current incarnation of AM,
the average length of this path is only about 5. $.
We shall call this process "⊗4rippling in the Genl direction⊗*".
The end result of this rippling will be a list of all known generalizations
of concept C.

Analogously, we can "⊗4ripple in the Spec direction⊗*", tracing downward along the
Spec links, to recover all specializations of a given concept.
This is guaranteed to terminate only because the number of nodes is finite.

Now that we've evolved a fast way to find Specializations(X) and Generalizations(X),
let's see if we can do the same for Examples(X) and ISA's(X).
Even though this is cute and interesting in its own right, let's not forget that
the reason for all this work is to derive a fast way of computing ISA's(X),
since the heuristics relevant to C are precisely those tacked onto concepts in
ISA's(C).


.ASSECP(Efficiently Locating Examples)

.EXISA: PAGE

We're now ready to formally define what we mean when we say that A is an
⊗4example⊗* of concept B.
This relationship holds iff
A satisfies
the definition  of B. 
For example, "3" satisfies the first entry stored in the Definitions facet
of the Number concept, so it must be an example of a number. Similarly,
"3" is also an example of a Prime, an Odd, an Odd-prime, an Object, etc.

Notice that being an example of something is completely independent of
being a specialization of that thing:

.BN

λλ "Compose" is both a specialization of, and an example of, "Operation".
"Compose" is a valid example of an operation,  since it satisfies
the definition  of Operation. On the other  hand, compositions form a
subclass of  all  operations; to  be  a  composition, X  must  be  an
operation  and  also  must satisfy  some  stronger  requirements.  So
Composition is also a specialization of Operation.

λλ "Compose-Insert-and-Delete" is an example of "Compose", but not a specialization.

λλ "Compose-with-Self" is a specialization of "Compose", but not an example.

λλ "Sets" is not related to "Compose" via Spec, Genl, Exs, or ISA.$$
Of course they both have a common ancestor, Any-concept, but one has to
switch directions (up and then down again) to get from one concept to another
using only those four kinds of links. $

.E

We were able to make the following powerful statement about Specializations:
"The specializations of C are precisely those caught by rippling away from C
in the Spec direction$$
The specializations facet of C is denoted C.Spec. 
The entries thereon are called Specs of
C. The larger set Specializations(C) contains not only C.Spec, but all specializations
of concepts in C.Spec. Specializations(C) is obtained by rippling away from C in the
Spec direction; that is, C.Spec ∪ (C.Spec).Spec ∪...
$". Can we make an analogous statement for Examples?
The analogous one would be "The examples of C are precisely those caught by
rippling away from C in the Exs direction$$
The examples facet of C is denoted C.Exs. The entries thereon are called Exs of
C. The larger set Examples(C) refers to all known concepts which satisfy the
definition of C. Thus Examples(C) contains C.Exs plus many more concepts (probably).
$". This is beautifully symmetric, but
utterly false! 
If X satisfies the definition of Y, and Y satisfies the definition of Z,
then X may or may not (and usually ⊗4won't⊗*) satisfy the definition of Z.
Number satisfies the definition of Any-concept, and the atom "3" 
satisfies the definition
of Number, but "3" certainly does ⊗4not⊗* satisfy the definition of Any-concept.

After such a dismal failure, it is often helpful to  ask
(we may learn something by asking)
"Why did it work for Spec and not for Exs?"
For Specializations, it worked because if x$$
For the remainder of this subsection, X,Y,Z will represent any three concepts,
and x,y,z will represent their definitions. P, Q, R are variables for predicates.
Recall that a∧b means "both a and b". $
has the form
y∧P, and y has the form z∧Q, then x has the form y∧P = (z∧Q)∧P = z∧(Q∧P).
We're talking now about the definitions x,y,z of three concepts X,Y,Z.
So because of the associativity of "AND", we can show that if
X is a specialization of Y, and Y of Z, then X is certainly a specialization of Z.

But wait! If X is an example of Y, and Y is a ⊗4specialization⊗* of Z,
that means X satisfies Y's definition, which means that X satisfies z∧P,
(where z is Z's definition and P is some predicate),
so X satisfies z plus some other tests, so X satisfies Z's definition, 
so X is an example of Z.
So we ⊗4do⊗* have a slightly analogous statement after all:
"Any example of Specializations(C) is an example of C".

This relation is not quite so symmetric as the first one; a heuristic in the
back of my mind urges me to examine the mirror image of this statement.
Namely, is it true that "Any specialization of an example of C is also an
example of C"? Well, let's see. Suppose X is a specialization of Y, and Y is an
example of Z.
Since X is a specialization of Y, x is of the form y∧P.
Since Y is an example of Z, Y satisfies z.
But tacking the extra conjunct P onto Y's definition doesn't change what kind of
entity is being defined (a restricted operation is still an operation), so X
satisfies z as well. So X ⊗4is⊗* an example
of Z. So the statement is true.

We can put both of the above partial results together, into one necessary and
sufficient characterization of the (known, reachable) examples of a concept:

⊗6"Examples(C) are precisely the members of Specializations(Exs(Specializations(C)))."⊗*

What this means graphically is as follows. To find examples of C, start at node
C. Ripple as far as you want (not at all, if desired) in the Spec direction.
Then go in the Exs direction just once. Then ripple again in the Spec direction
as far as you wish.
Let's assume that a triple vertical line (⊗8α⊗⊗*) signifies an Exs link. Then
here is the graphical interpretation of the relationship just derived:

.B0 INDENT 20

C1
 \
  \
   \	⊗4any number of Spec links⊗*
   ...
     \
      \
      C2
       α⊗     ⊗4one Exs link⊗*
       α⊗   
      C3
       \
	\
	 \	  ⊗4any number of Spec links⊗*
	 ...
	   \
	    \
	    C4

.E

.GROUP

C1 might coincide with  C2, and C3 with C4. Using  the simple diagram
below, the examples of Operation which are pictured are Composition,
Dom=Range-op, Compose-with-self, and Compose-Intersect&Intersect. The
last of these is plucked in two distinct ways, in fact:

.B0 INDENT 20
 Any-concept
   \     α⊗
    \	 α⊗
     \	 α⊗
      \	 α⊗
   Activity
      /	\
     /	 \
    /	  \
   /	   \
Predicate  Operation
 α⊗         / \  α⊗
 α⊗        /   \ α⊗
Equal    /    Composition
        / α⊗    \
       / α⊗      \
Dom⊗1=⊗*Range-op     Compose-with-self  
       α⊗          α⊗			 
       α⊗          α⊗			 
    Compose-Intersect⊗1&⊗*Intersect

.E

.APART

Here is a list of all the relationships that can be derived from this
little graph, using the two rules:$$ RULE1: Specializations(x) = x.Spec ∪ x.Spec↑2 ∪ ..., and
RULE2: Examples(x) = Specializations(Exs(Specializations(x))). $

.BEGIN NOFILL SELECT 1 INDENT 0

Specializations(Any-concept) = (Activity, Predicate, Operation, Composition, Dom=Range-op,
				Compose-with-self)
Specializations(Activity) = (Predicate, Operation, Composition, Dom=Range-op, Compose-with-self)
Specializations(Predicate) = NIL
Specializations(Equal) = NIL
Specializations(Operation) = (Composition, Dom=Range-op, Compose-with-self)
Specializations(Dom=Range-op) = NIL
Specializations(Composition) = (Compose-with-self)
Specializations(Compose-with-self) = NIL
Specializations(Compose-Intersect&Intersect) = NIL

Examples(Any-concept) = (Activity, Predicate, Operation, Composition, Dom=Range-op, 
				Compose-with-self, Equal, Compose-Intersect&Intersect)
Examples(Activity) = (Equal, Composition, Dom=Range-op, 
				Compose-with-self, Compose-Intersect&Intersect)
Examples(Predicate) = (Equal)
Examples(Equal) = NIL
Examples(Operation) = (Composition, Dom=Range-op, 
				Compose-with-self, Compose-Intersect&Intersect)
Examples(Dom=Range-op) = (Compose-Intersect&Intersect)
Examples(Composition) = (Compose-Intersect&Intersect)
Examples(Compose-with-self) = (Compose-Intersect&Intersect)
Examples(Compose-Intersect&Intersect) = NIL

.E

So AM can get by with a total of 12 links (LISP pointers) to store a total
of 37 relationships (which would take 37 separate links if the "everything
points to everything" scheme were used).
In toto, the actual LISP implementation has about 400 links, implicitly
storing a total of about 5000 relationships (any of which can be recovered almost
instantaneously, by using the formulae for Specializations and Examples).
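
As a concrete check, here is a rough sketch (Python once more, not AM's LISP)
that stores just those twelve links -- note that Operation's Exs facet points
to both Composition and Dom=Range-op, per the lists just given -- and recovers
the Specializations and Examples lists by RULE1 and RULE2:

.B0

# The 12 stored links (6 Spec + 6 Exs) of the little graph above.
SPEC = {"Any-concept": ["Activity"],
        "Activity":    ["Predicate", "Operation"],
        "Operation":   ["Composition", "Dom=Range-op"],
        "Composition": ["Compose-with-self"]}
EXS  = {"Any-concept":       ["Activity"],
        "Predicate":         ["Equal"],
        "Operation":         ["Composition", "Dom=Range-op"],
        "Dom=Range-op":      ["Compose-Intersect&Intersect"],
        "Compose-with-self": ["Compose-Intersect&Intersect"]}

def ripple(c, links):
    """Transitive closure of one link type, starting from c."""
    found, frontier = set(), list(links.get(c, []))
    while frontier:
        n = frontier.pop()
        if n not in found:
            found.add(n)
            frontier.extend(links.get(n, []))
    return found

def specializations(c):                         # RULE1
    return ripple(c, SPEC)

def examples(c):                                # RULE2
    result = set()
    for s in specializations(c) | {c}:          # ripple in the Spec direction
        for e in EXS.get(s, []):                # one Exs link
            result |= specializations(e) | {e}  # ripple in Spec again
    return result

# examples("Operation") ==> {'Composition', 'Dom=Range-op',
#     'Compose-with-self', 'Compose-Intersect&Intersect'}

.E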

.ASSECP(Efficiently Locating Relevant Heuristics)

Each  concept has facets  which contain some heuristics.  Some of
these  are for filling in,  some for checking,  some for deciding the
interestingness, etc.  For  example, one Interest feature  of Compose
says that a composition is more interesting if the range of the first
argument equals the domain of the  second, and that a composition  is
meaningless if those two sets don't even intersect.
One entry on the Interest facet of Operation says that an operation is
interesting if its domain and range coincide.

Suppose we want to  judge the interestingness of a concept like 
"Compose-Intersect-and-Intersect"$$ The intersection of
3 sets. Look at the graph in the previous section,
to refresh your memory of how this relates to other concepts. $.
How can AM zero in on the relevant heuristics?
The key observation is: "A heuristic tacked onto C will be relevant iff
Compose-Intersect-and-Intersect is an Example of C".

Turning this around, we wish to compute ISAs(Compose-Intersect-and-Intersect).
That is, find all the concepts which claim it as an example. From our previous
work, we know how to do this:

⊗6ISAs(X) = Generalizations(ISA(Generalizations(X)))⊗*

Here is a nondeterministic way of performing that procedure:
Start at concept X, ripple upwards in the Genl
(Single-line link, opposite of Spec)
direction as much as
you please, go upwards once in the ISA (Triple-line link, opposite of Exs) 
direction,
and then ripple again in the Genl direction.
More graphically, we can show this as follows:

.B0 INDENT 6

                                 C4     ⊗4This concept claims X as an example⊗*
				/
			       /
			      /
			    ...	    ⊗4any number of "Genl" links⊗*
			    /
			   /
			  /
			C3	 
			α⊗
			α⊗     ⊗4One single "Isa" link⊗*
			C2
			/
		       /
		      /
		    ...     ⊗4any number of "Genl" links⊗*
		    /
		   /
		  /
		X     ⊗4The given concept⊗*
.E

Some of these concepts might coincide. 
The relation "generalization-of" is transitive (due to the transitivity
of "subset"), but the relation "is-an-example-of" is not transitive
(since "member" isn't transitive).

From the little graph in the preceding section, we can apply this formula to obtain:

.ONCE PREFACE 0 INDENT 0,30,4

ISAs(Compose-Intersect-and-Intersect) = 
(Any-concept, Activity, Operation, Composition, Dom=Range-op, Compose-with-self).
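
Here is the mirror-image sketch (again Python with hypothetical names), storing
only the Genl and ISA links -- the inverses of the Spec and Exs links of the
earlier sketch -- for the same little graph:

.B0

GENL = {"Activity":          ["Any-concept"],
        "Predicate":         ["Activity"],
        "Operation":         ["Activity"],
        "Composition":       ["Operation"],
        "Dom=Range-op":      ["Operation"],
        "Compose-with-self": ["Composition"]}
ISA  = {"Activity":     ["Any-concept"],
        "Equal":        ["Predicate"],
        "Composition":  ["Operation"],
        "Dom=Range-op": ["Operation"],
        "Compose-Intersect&Intersect": ["Dom=Range-op", "Compose-with-self"]}

def ripple(c, links):
    found, frontier = set(), list(links.get(c, []))
    while frontier:
        n = frontier.pop()
        if n not in found:
            found.add(n)
            frontier.extend(links.get(n, []))
    return found

def isas(x):     # ISAs(X) = Generalizations(ISA(Generalizations(X)))
    result = set()
    for g in ripple(x, GENL) | {x}:          # ripple upward along Genl
        for i in ISA.get(g, []):             # one ISA link
            result |= ripple(i, GENL) | {i}  # ripple along Genl again
    return result

# isas("Compose-Intersect&Intersect") ==> {'Any-concept', 'Activity',
#     'Operation', 'Composition', 'Dom=Range-op', 'Compose-with-self'}

.E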

As AM ripples upwards
from Compose-Intersect-and-Intersect,
it gathers heuristics along the way.  Several of these
have to do with why a  Composition of any kind might be  interesting;
several have to do with why any operation might be interesting, etc.
The two heuristics mentioned in the first paragraph are thus both located quickly.
Incidentally, both of them are satisfied, and say that this operation is interesting.
A few more general heuristics are encountered, tacked onto Any-concept, which
test some features that would make any concept interesting.

Instead of "judging interest", suppose we wish to "fill in examples" of that
bottom concept,
Compose-Intersect-and-Intersect.
How can AM find the heuristics that will result in the right kind of entries being
created/found/computed?
Again, the answer is to ripple upwards, gathering heuristics from each concept
encountered.  
Any heuristic tied to a concept in ISAs(Compose-Intersect-and-Intersect) will be
relevant.
For example, heuristics tacked onto Compose explain how to  use examples  of its
arguments to find examples of the composition. Heuristic rules tacked onto 
Operation explain how
to "run"  the concept on  randomly chosen domain  elements, to derive
examples that way.  Any-concept's heuristics  say to instantiate the  definition
of the  concept, to symbolically  construct examples directly  from the
definitions of Compose-Intersect&Intersect.

In  general, the suggestions  of the  more specific concepts  will be
better than those of the general concepts. This is the old generality
⊗4versus⊗*  power tradeoff.   So  AM begins  executing the  heuristic
rules, rippling upward, and quits after it's used up its time quantum
(its activation energy) for the current task.  If  it doesn't make it
all the way up to  Anything, that's no great loss, since Anything has
such general information that it's almost never practical to use
it.  (e.g., "⊗6To fill in examples of  anything, pick random objects and
see  if   they  satisfy  the  definition⊗*").     In  production  system
terminology,  one may  view  ⊗4all⊗*  the  heuristics  accessible  by
rippling  as "triggering",  and the  specific-first policy  is simply
AM's  answer to conflict resolution (how  to decide which "triggered"
rule's right side actually gets executed next).
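
Schematically, the specific-first policy with a time quantum might be rendered
as follows (a Python sketch only; the helper names and the quantum mechanism
are hypothetical stand-ins for AM's actual agenda machinery):

.B0

import time

def work_on_task(concept, facet, heuristics_of, ordered_isas, quantum=2.0):
    """Run the relevant heuristics, most specific concept first,
    quitting when the task's time quantum is used up."""
    deadline = time.time() + quantum
    for c in ordered_isas(concept):         # nearest ISA's first; "Anything" last
        for h in heuristics_of(c, facet):   # in their stored order on c
            h(concept)                      # fire one "triggered" rule
            if time.time() > deadline:
                return                      # quantum exhausted; the most
                                            # general rules simply never run

.E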

These linkages serve  another purpose, besides providing  an implicit
framework to guide the gathering of heuristic rules. Suppose one rule
asks for examples of Operations, during its course of execution.  How
can AM quickly find  all examples of operations?   Well, a reasonable
answer is  to ripple downward along the  Spec direction for some time
(perhaps not at all), then go down the Exs link (only once), and then
ripple along Spec again as far as you please.

So the links  are used for two purposes:

.BN

λλ To guide the  gathering of heuristics. (In
order to actually execute a task once it's been chosen from the agenda;
in order to judge the worth of a given concept; etc.)

λλ As efficient ways to gather examples, specializations, etc.
of a given concept.

.E

.ASSECP(Independence Assumptions)


A task  dealing with concept  X has been  chosen.  AM  ripples upward
away from X, gathering relevant heuristics.  They are executed in the
order they are encountered, so  that the heuristics applying to  very
general concepts  will be executed  last.   This is because  the more
specific a heuristic is, the more powerful it is ⊗4when it applies⊗*.
If a single concept  yields up several relevant heuristics,  they are
executed in the order they're stored on that concept.

No processing is done to re-order the set of all relevant heuristics.
There are two reasons for this:

.BN

λλ Empirically it wasn't  necessary; the specific-first ordering  was
always close to  the best possible  one.  It's pointless  to squander
cpu time.

λλ Theoretically, this group of heuristics can be run as a production
system, with no ordering  whatsoever. Conflict resolution should  not
be a crucial issue.

.E

The first reason holds because there  is weakness in generality.  
The rules tacked onto the specific concepts are almost always superior to those
tacked onto more general concepts.
The
second  reason  embeds  the assumption  that  the  heuristics  do not
interact with each other.  That  assumption of independence (or
"linearity", if you prefer) is quite dangerous, and is not absolutely
true. Sometimes the left-hand-side of  a heuristic rule will  trigger
(or not) depending on what other heuristic rules have just been tried
(or not tried).  For example, a typical left-hand-side might begin by
saying  "⊗6If AM  has  just  spent  at  least  3  seconds  trying  to
instantiate X.DEFN,...⊗*". 
Or a rule might be worded as "⊗6If ↓_any_↓ examples of X exist, Then a
new one can be gotten by...⊗*".
In this case, the rule might fail just because it was the first rule tried.
To avoid just this  kind of order-dependence,
each rule is actually accessed twice.

There  is another kind of "linearity"  assumption which is dangerous,
false, and  yet one we can  get away with:  How can  we be  sure that  a
heuristic won't apply to a few scattered concepts, but not to any one
concept and its examples?   
That is, how do we know that each heuristic has a nice domain of applicability?
We are associating  each heuristic with  a
particular  concept.   How  do  we know  that  we shouldn't  have  to
associate  it in general with some random ⊗4subset⊗* of concepts?   
In fact, we do assume that
the
associated  concept  generates  the  subset  of  relevant   concepts:
namely, all examples of that concept.   This simplistic assumption is ⊗4not⊗* 
always true. For
example,  sometimes, by analogy, a certain  heuristic is used way outside
its normal domain of  applicability. Then the domain consists  of two
isolated pieces of the graph of concepts,  two regions that
can never be  reconciled into one.   It is  untenable to
have to maintain all 2↑[200] subsets of 200 concepts, because of  the
extreme magnitude of that  number. The only answer to  this dilemma is again
empirical:  as a first  approximation, there are  very few heuristics
(and very  few  instances where  they  are  crucial) which  can't  be
squeezed into the simple "linear" mold.
If a heuristic is really applicable in a couple isolated places, AM will suffer
a duplicate of it to be maintained. In the LISP program, no such duplicates
were ever required.

Another linearity,  not dealing  with heuristics, was  the assumption
that  the top  few jobs on  the agenda  rarely interact,  so it rarely
matters which  of them  goes before the  others, so  AM never  spends
cpu time trying to precisely order  them: it always picks the very top
one and leaves it at that.

All these  simplifications,  these  "independence  assumptions",  or
linearizings, are the inklings  of a science -- at least,  of a model
-- of  mathematical research. They imply what  kinds of things can be
ignored, what factors don't couple, etc.